DOCUMENT RESUME

ED 324, 098                                         E2 007 -^98

AUTHOR         Chidambaram, T. S.
TITLE          Enrollment Forecasting in an Open Admissions
               Environment.
INSTITUTION    Federal City Coll., Washington, D.C. Office of
               Institutional Research.
PUB DATE       74
NOTE           50p.; Not available in hard copy due to marginal
               legibility of original document.
EDRS PRICE     MF-$0.83 Plus Postage. HC Not Available from EDRS.
DESCRIPTORS    College Freshmen; Educational Planning; *Enrollment
               Projections; *Enrollment Rate; *Higher Education;
               *Institutional Research; *Mathematical Models; *Open
               Enrollment; Post Secondary Education; Prediction;
               Predictor Variables; Public Schools; Student
               Enrollment
IDENTIFIERS    *Federal City College

ABSTRACT

Developing a model for predicting demand for freshmen
requirement courses (from freshmen enrollees and from returned
enrollees who failed to complete the course in their previous
quarters) is the objective of the Freshmen Requirement Study, now
partially completed by Federal City College. Work done so far has
essentially validated an initial approach to the problem where past
enrollment behavior has been taken as a predictive factor. Analysis
of data specially compiled for the study shows that the more
numerous and more recent a student's enrollment has been in the past,
the higher his probability of return is. The difference between the
summer quarter and other quarters has been documented. A model that
fits well the observed data on return probabilities has been
constructed and, using it, the effects of the various components of
past behavior affecting return probabilities have been measured.
Future work will validate the model with more recent data, fortify
it, if necessary, with other explanatory variables, and put it into
operation, setting up systems for data collection and analysis.
Appendices discuss mathematics of probabilistic predictive models:
Operationalizing Predictive Models and Fitting an Additive Model to
Logarithm of Reenrollment Probabilities. (Author/«3T)



Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available. Nevertheless, items of marginal
reproducibility are often encountered and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.






Enrollment Forecasting In An
Open Admissions Environment*



T. S. Chidambaram, Ph.D.
Ken Robert Gramza, Director



Office of Institutional Research
Federal City College



*To be presented at the 74th Annual Meeting of the American Educational
Research Association, April 15-19, 1974, at Chicago, Illinois.



ENROLLMENT FORECASTING IN AN OPEN ADMISSIONS ENVIRONMENT



• TO PRESENT A GENERAL SCHEME APPLICABLE FOR FORECASTING
  THE NUMBER OF STUDENTS RETURNING TO SCHOOL FROM PREVIOUS
  QUARTERS/SEMESTERS

• IN AN OPEN ADMISSIONS ENVIRONMENT, STUDENTS DROP IN AND OUT
  AT WILL, MAKING THIS PREDICTION IMPORTANT FOR SUCH TASKS AS

    • COURSE SCHEDULING

    • ESTIMATING GRADUATE OUTPUT

    • RECRUITMENT TARGET DETERMINATION



EXHIBIT 2



• TABLE 1 SHOWS THAT ABOUT 22% OF STUDENTS ENROLLED IN ONE
  QUARTER DO NOT RETURN NEXT QUARTER BUT ABOUT [illegible] OF THESE
  DROPOUTS DO RETURN TO SCHOOL IN LATER QUARTERS.

  PERCENTAGE RETURNING AFTER FOUR QUARTERS OF ABSENCE IS SMALL.

• PREVIOUS STUDIES INDICATE FUTILITY OF USING SOCIOECONOMIC
  VARIABLES TO PREDICT DROPOUT BEHAVIOR.

• APPROACH TAKEN HERE IS TO USE PAST ENROLLMENT BEHAVIOR ITSELF
  AS PREDICTOR VARIABLE. SPECIFICALLY ENROLLMENT IN PREVIOUS
  FOUR QUARTERS IS USED.




EXHIBIT 3



• DEFN: AFFILIATED STUDENT IN QUARTER Q IS A STUDENT WHO WAS
  ENROLLED IN AT LEAST ONE OF THE FOUR QUARTERS PRECEDING Q.

• EACH AFFILIATED STUDENT CAN BE ASSIGNED TO ONE OF FIFTEEN
  ENROLLMENT PATTERNS (REPRESENTED BY A FOUR DIGIT BINARY
  NUMBER) DEPENDING ON HIS ENROLLMENT PATTERN IN LAST FOUR
  QUARTERS.



[Figure: enrollment patterns laid out by quarter (Q-4, Q-3, Q-2, ...),
with two examples: a student enrolled only in quarter Q-2, and a
student enrolled in quarters Q-4 and Q-3 only.]
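
The coding of an enrollment history as a four digit binary number can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the digit order (oldest quarter Q-4 first, most recent quarter Q-1 last) is an assumption taken from the report's own '0100' example for a student who enrolled three quarters ago.

```python
def enrollment_pattern(enrolled_by_lag):
    """Return the four digit pattern for one student.

    enrolled_by_lag maps a lag k (1 to 4, meaning quarter Q-k) to True/False.
    Digits are assumed to run from the oldest quarter (Q-4) to the most
    recent one (Q-1).
    """
    return "".join(
        "1" if enrolled_by_lag.get(lag, False) else "0" for lag in (4, 3, 2, 1)
    )


def is_affiliated(pattern):
    """A student is affiliated if enrolled in at least one of the four quarters."""
    return "1" in pattern


# The two examples shown in the exhibit.
only_q2 = enrollment_pattern({2: True})
q4_and_q3 = enrollment_pattern({4: True, 3: True})
print(only_q2, q4_and_q3)
```

Under this digit order the two examples come out as 0010 and 1100, and the all-zero pattern is the only one that is not affiliated.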



EXHIBIT 4



• TABLE 2 SHOWS THE RETURN PROBABILITIES FOR STUDENTS WITH VARIOUS
  ENROLLMENT PATTERNS. SUMMER QUARTER IS OBVIOUSLY DIFFERENT AND
  SOME PATTERNS HAVE HIGHER RETURN PROBABILITIES.

• TWO-WAY ANALYSIS OF VARIANCE (TABLE 3) SHOWS SIGNIFICANT
  INTERACTION BETWEEN QUARTERS AND PATTERNS EVEN AFTER REMOVING
  SUMMER QUARTER. THIS IMPLIES THAT THE DIFFERENCE BETWEEN
  PATTERNS IS NOT UNIFORM FOR ALL REGULAR QUARTERS (I.E., FALL,
  WINTER AND SPRING). EXAMPLE: 0101 AND 0110. HENCE PATTERN
  DIFFERENCES ARE OBSCURED BY THIS INTERACTION.

• CONSISTENT PATTERN DIFFERENCES EMERGE (TABLE 4) ON
  REARRANGEMENT OF TABLE 2 DATA BY CONSIDERING PATTERNS BASED
  ON SUMMER QUARTERS SEPARATELY.






EXHIBIT 5



TABLE 4 SHOWS THAT

• ENROLLMENT IN SUMMER ALWAYS INCREASES RETURN PROBABILITY

• THE RETURN PROBABILITY INCREASES WITH THE NUMBER OF
  QUARTERS ONE ATTENDS.

• THE MORE RECENT THE ENROLLMENT EXPERIENCE IN A REGULAR
  QUARTER, THE HIGHER THE RETURN PROBABILITY


ENROLLMENT PATTERNS HAVE BEEN ARRANGED IN TABLE 4 ACCORDING TO
THE ABOVE HYPOTHESES AND PRODUCE A STRIKINGLY CONSISTENT
PICTURE.






EXHIBIT 6



• TO QUANTITATIVELY ESTIMATE THE EFFECT OF PREVIOUS ENROLLMENT
  A 17 PARAMETER MODEL OF THE FOLLOWING TYPE WAS FIT TO THE
  TABLE 2 DATA

      LN(1-P) = M + B_1 + B_2 + B_3 + B_4 + E

  WHERE

      P   = RETURN PROBABILITY IN QUARTER Q
      M   = GENERAL MEAN
      B_1 = EFFECT OF ENROLLMENT IN QUARTER Q-1
      B_2 = EFFECT OF ENROLLMENT IN QUARTER Q-2
      B_3 = EFFECT OF ENROLLMENT IN QUARTER Q-3
      B_4 = EFFECT OF ENROLLMENT IN QUARTER Q-4

• ACTUAL MODEL DISTINGUISHED BETWEEN REGULAR AND SUMMER
  QUARTERS AND POSTULATED SEPARATE ENROLLMENT AND DROPOUT
  EFFECTS IN EACH QUARTER.



EXHIBIT 7



• MODEL FIT THE OBSERVED DATA WELL, EXPLAINING [?]0% OF VARIATION
  (SEE TABLE 5)

• THE LEAST SQUARE ESTIMATES OF PARAMETER VALUES WERE

               REGULAR     SUMMER

      M   =     -1.05
      B_1 =     -0.84
      B_2 =     -0.48      -0.49
      B_3 =     -0.28      -0.16
      B_4 =     -0.28      -0.15

• THE PARAMETER VALUES SUPPORT THE PREVIOUS QUALITATIVE
  INFERENCES.
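
To see how estimates on the LN(1-P) scale translate into return probabilities, a small Python sketch may help. It is a simplified illustration, not the 17 parameter model itself: it uses only a general mean and the regular-quarter enrollment effects as they appear to read in this exhibit, it omits the separate summer and dropout effects and the error term, and the function name is hypothetical. The resulting numbers should therefore not be taken as the study's fitted probabilities.

```python
import math

# Values as they appear to read in the exhibit (regular-quarter column).
# Treat them as illustrative; the scan is hard to read.
M = -1.05
B_REGULAR = {1: -0.84, 2: -0.48, 3: -0.28, 4: -0.28}


def return_probability(lags_enrolled):
    """Invert ln(1 - P) = M + sum of B_k over the lags k with enrollment."""
    log_non_return = M + sum(B_REGULAR[k] for k in lags_enrolled)
    return 1.0 - math.exp(log_non_return)


p_all_four = return_probability([1, 2, 3, 4])  # enrolled in every quarter
p_recent_only = return_probability([1])        # enrolled only in Q-1
p_old_only = return_probability([4])           # enrolled only in Q-4
print(round(p_all_four, 3), round(p_recent_only, 3), round(p_old_only, 3))
```

Because every effect is negative on the log scale of the non-return probability, each additional quarter of enrollment raises P, and the larger magnitude of the Q-1 effect makes recent enrollment count for more than old enrollment, which is the qualitative picture the exhibit describes.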



EXHIBIT 7A 



THIS ANALYSIS SUBSTANTIATES THE VALIDITY OF USING PAST ENROLLMENT
HISTORY FOR FORECASTING RETURNING STUDENTS



GENERAL FORECASTING SCHEME



AFFILIATED POPULATION IN QUARTER Q
        |
        v
BREAKDOWN INTO 15 SUBSETS BASED ON PREVIOUS ENROLLMENT
        |
        v
APPLY RETURN PROB APPROPRIATE FOR QUARTER Q TO EACH SUBSET
        |
        v
PROJECTED TOTAL RETURNEES IN QUARTER Q
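
The scheme can be sketched as a short Python routine. This is a minimal sketch under stated assumptions: the function name and the sample population and probabilities are hypothetical, and only three of the fifteen patterns are populated to keep the example small.

```python
from collections import Counter


def forecast_returnees(patterns, return_prob):
    """Project total returnees for the target quarter.

    patterns: one four digit pattern string per affiliated student.
    return_prob: return probability for the target quarter, keyed by pattern.
    """
    subset_sizes = Counter(patterns)  # breakdown into subsets by pattern
    return sum(size * return_prob[p] for p, size in subset_sizes.items())


# Hypothetical affiliated population and probabilities, for illustration only.
population = ["1111"] * 100 + ["0001"] * 40 + ["1000"] * 60
probabilities = {"1111": 0.9, "0001": 0.5, "1000": 0.2}
projected = forecast_returnees(population, probabilities)
print(projected)
```

Restricting `population` to a subpopulation (for example, one major) before calling the function gives the corresponding subpopulation forecast without any other change.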



EXHIBIT 8 



FORECASTING TOTAL RETURNING ENROLLEES AT FCC
DURING FALL, WINTER AND SPRING 1972-73



QUARTER            FALL 1972-73     WINTER 1972-73    SPRING 1972-73

ACTUAL TOTAL
RETURNEES          5708             5998              6158

MODEL PREDICTION   5968 (5895)*     6478 (6319)*      6298 (6211)*
STD ERROR           283 ( 305)       270 ( 295)        267 ( 290)

DEVIATION          +260 (+187)      +480 (+321)       +140 (+53)
DEVIATION/SE       0.92 (0.61)      1.8  (1.1)        0.52 (0.18)



*FIGURES IN PARENTHESES WERE OBTAINED BY USING RETURN PROBABILITIES
ESTIMATED BY MODEL AS GIVEN IN TABLE 5.
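
The deviation rows of this exhibit follow directly from the rows above them, which can be checked with a few lines of Python. The sketch below uses the unparenthesized figures as read from the exhibit; the function and variable names are hypothetical.

```python
actual = {"Fall": 5708, "Winter": 5998, "Spring": 6158}
predicted = {"Fall": 5968, "Winter": 6478, "Spring": 6298}
std_error = {"Fall": 283, "Winter": 270, "Spring": 267}


def deviation_report(actual, predicted, std_error):
    """Return (deviation, deviation / standard error) for each quarter."""
    report = {}
    for quarter in actual:
        deviation = predicted[quarter] - actual[quarter]
        report[quarter] = (deviation, round(deviation / std_error[quarter], 2))
    return report


report = deviation_report(actual, predicted, std_error)
for quarter, (deviation, ratio) in report.items():
    print(quarter, deviation, ratio)
```

All three predictions lie above the actual counts, and each deviation is within two standard errors.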



EXHIBIT 9



CONCLUSIONS



• THE FORECASTING SCHEME CAN BE USED TO FORECAST RETURNING ENROLLEES
  FROM ANY SUBPOPULATION, E.G., ENGINEERING MAJORS.

• THE MODEL CAN BE OPERATIONALIZED VERY CONVENIENTLY AND CAN BE
  COMPUTERIZED.

• REQUIRES NO EXPENSIVE DATA BUT USES ONLY ROUTINE DATA COLLECTED
  BY INSTITUTIONS.



AN INTERIM TECHNICAL REPORT ON
THE FRESHMEN REQUIREMENTS STUDY



This report is on the work done so far in an effort
to build a predictive model for freshmen course requirements.
It has essentially validated an initial approach to the
problem where past enrollment behavior has been taken as a
predictive factor. The analysis of the data specially compiled
for the study shows that the more numerous and more recent
a student's enrollment has been in the past, the higher his
probability of return is. The difference between the summer
quarter and other quarters has also been brought out. A model
that fits well the observed data on return probabilities has
been constructed. Using this model the effects of the
various components of past behavior affecting return
probabilities have been measured.

The report also discusses the general mathematics of
probabilistic predictive models, emphasizing the practical aspect
of designing a system to operationalize a model. Finally
the work which remains to be undertaken in this study is
described.



1. Introduction

The Freshmen Requirement Study (FRS) has the objective of developing
a model for predicting demand for freshmen requirement courses in FCC.
A preliminary model was developed by Gramza, Diaz and Shore in July '72
which served as a basis for undertaking an exhaustive study starting
Nov. '72. A review of the status of the study is presented below giving
emphasis to both what has been done and what remains to be done.

Since the demand for freshmen courses is from fresh enrollees as well
as from returning enrollees who failed to complete the course in their
previous quarters, the study will have to address itself to the task of
predicting a number of variables such as

  i) number of new enrollees, possibly with a breakdown by credits
     transferred for freshmen requirement courses

 ii) number of students returning from previous quarters (concentrating
     specifically on students who have not completed freshmen
     requirement courses), and

iii) number of enrolled students who are still to complete the
     freshmen requirement course and demand that course.

The third variable mentioned above is influenced heavily by such factors
as counselling and the number of students who can be accommodated in the
freshmen courses that particular quarter. However, if we are concerned
with estimating demand for freshmen requirement courses with the purpose
of planning enough sections, then we might justifiably ignore the
third factor.



The demand forecast ignoring the third factor would give the number of
students who need to take a freshmen requirement course in a particular
quarter, and planning for the courses should be based on this number.
To the extent the role of counselling is to advise the students to take
the courses at the earliest opportunity (subject only to the number of
sections planned that quarter), the effect of counselling need not be
taken into consideration separately.

Thus in this study the objective has been set as one of predicting, for
a given quarter and given freshmen requirement course, the number of
students who would need to take that course. Of the two factors (i)
and (ii) listed earlier, our attention will be initially on the second
one, namely the demand generated by returning enrollees. The data base
and analytical techniques for predicting new enrollees are more difficult
to develop, while the prediction of returning enrollees can be performed
with only data currently available in FCC. Further, to a certain extent
FCC can regulate the number of new enrollees, so that it makes sense to
treat this variable as an input parameter in a predictive model rather
than a variable to be itself predicted.

With these considerations, the scope of the Freshmen Requirements Study
has been specified as one of performing the necessary statistical
analysis on the available FCC data to develop a model capable of
predicting the demand for freshmen requirement courses from returning
enrollees, it being assumed that all those who need to take a course
would indeed create a demand for the course. As explained earlier, with
this assumption we are effectively doing away with the necessity to
consider the third factor listed above and producing a prediction that
is more relevant to decision making regarding space, faculty and other
resources planning. Factor (i) (namely, the new enrollees) would be an
external input parameter to the predictive model.

The strategy for developing the model is as follows. Consider the
problem of predicting demand for course X in Quarter i. The set of all
affiliate students at start of Quarter i is the source population for
all returning enrollees. Some affiliate students have passed course X
and the rest have not; call the latter subset S. This is the population
of interest in predicting demand for course X. The basic approach lies
in partitioning S into subsets S_1, S_2, ..., S_M which are mutually
exclusive and exhaustive of S in such a way as to accomplish the
following objectives:

(C-1) Each set S_j is a homogeneous group of affiliate students,
      homogeneity being used in the sense that the probability for
      returning to school is the same for all members of the set. Perfect
      homogeneity can seldom be achieved in practice, since so many
      socioeconomic characteristics affect the return probability
      and almost any two students will have differing probabilities.
      However, by basing the definition of the subsets on the most
      important of these variables, we can come close to this ideal.


(C-2) FCC data should make it possible to classify a student into
      one of the subsets S_1, ..., S_M. In other words we should
      not use in the predictive model any variables on which we cannot
      have data. Since the freshmen requirement study would construct
      the predictive models only on the basis of analysis of available
      data, this requirement should be automatically satisfied.

There are two other properties (C-3) and (C-4) which we would
like the sets S_1, ..., S_M to possess, but these will be presented
later at a more appropriate place.

Assuming we have discovered a satisfactory partitioning S_1,
S_2, ..., S_M of S, the next task would be to obtain the best
estimates of the parameters r_1, ..., r_M where:

      r_j = the probability that a student belonging to set S_j
            returns to school in quarter i.

It is expected that with the retention data available, this
estimation can be done fairly accurately.

The prediction of number of returning students in quarter i
would be then given by the expected value

$$\sum_{j=1}^{M} s_j \, r_j \qquad (1)$$

where s_j = number of students in set S_j; j = 1, ..., M.






The variance associated with the estimate is given by*

$$\sum_{j=1}^{M} s_j \, r_j \, (1 - r_j) \qquad (2)$$



In order that our prediction formula (1) remains invariant
over time (except possibly for predictable differences
between the four quarters of a year), it is necessary that
the probabilities r_j show stability over years. Again this
is not often met in practice, since trends and even abrupt
changes in return rates are experienced for a variety of
reasons. In practice, what this means is that we have to
design a system which continuously watches for trends and
changes and makes appropriate updating of the probabilities
r_j. These and many other practical considerations in
implementing a predictive model are discussed later.



The formula (2) for variance also gives us a clue as to
what should be considered a satisfactory partitioning.
Our aim should be to keep the variance as small as possible.
From (2) it is seen that the variance achieves the minimum
value of zero, that is, the estimate (1) is perfect and has no
error associated with it, if each r_j is either 0 or 1.



* The variance formula is strictly valid only if assumption
  (C-1) holds. For nonhomogeneous subsets the formula
  would provide an upper limit, i.e., the real variance would
  be less.
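
A short derivation may help show where a variance with these properties (zero when every r_j is 0 or 1, and an upper limit for nonhomogeneous subsets) comes from. It is a sketch under an assumption the report does not state explicitly, namely that students decide independently of one another. The symbols X_{jk}, R and p_k are introduced here only for illustration; s_j and r_j are the report's.

```latex
% X_{jk} = 1 if the k-th student of subset S_j returns, 0 otherwise,
% with P(X_{jk} = 1) = r_j under the homogeneity condition (C-1).
\begin{align*}
R &= \sum_{j=1}^{M} \sum_{k=1}^{s_j} X_{jk}, \\
\mathrm{E}(R) &= \sum_{j=1}^{M} s_j \, r_j, \\
\mathrm{Var}(R) &= \sum_{j=1}^{M} \sum_{k=1}^{s_j} r_j (1 - r_j)
                 = \sum_{j=1}^{M} s_j \, r_j (1 - r_j).
\end{align*}
% For a subset of size s that is not homogeneous, with individual
% probabilities p_k whose mean is r:
\begin{align*}
\sum_{k=1}^{s} p_k (1 - p_k)
  = s \, r (1 - r) - \sum_{k=1}^{s} (p_k - r)^2
  \le s \, r (1 - r).
\end{align*}
```

The last line is consistent with the footnote: treating a mixed subset as if it were homogeneous can only overstate the variance.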






The maximum of (2) is attained when each r_j = 1/2. This
implies that the partitioning should be in such a way that
the return probability r_j for each set is as close to zero
(or one) as possible. For example we should prefer a
partitioning (S_1, S_2) with probabilities (.2, .9) to a partitioning with
probabilities (.3, .6). With these remarks, we formally introduce two
other requirements (C-3) and (C-4) on the partitioning subsets S_1, ..., S_M:

(C-3) Partitioning of S into S_1, ..., S_M should be so done
      that the associated return probabilities r_1, ..., r_M are stable
      over time. (There is reason to believe that if (C-1) is
      satisfied then (C-3) is also likely to be satisfied.)

(C-4) Partitioning of S into S_1, ..., S_M should be so done
      that the associated return probabilities r_1, ..., r_M are all
      close to zero or one.
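
The preference for probabilities near zero or one can be checked numerically. The sketch below assumes the variance is the sum over subsets of s_j r_j (1 - r_j), the form implied by the properties described above, and it assumes two subsets of 100 students each; the subset sizes and the function name are hypothetical.

```python
def forecast_variance(sizes, probs):
    """Sum of s_j * r_j * (1 - r_j) over the subsets of a partitioning."""
    return sum(s * r * (1.0 - r) for s, r in zip(sizes, probs))


sizes = [100, 100]  # hypothetical subset sizes
var_sharp = forecast_variance(sizes, [0.2, 0.9])  # probabilities near 0 or 1
var_blunt = forecast_variance(sizes, [0.3, 0.6])  # probabilities nearer 1/2
print(round(var_sharp, 6), round(var_blunt, 6))
```

With these sizes the first partitioning gives a variance of about 25 against about 45 for the second, so its forecast carries the smaller standard error.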

This background discussion on the underlying concepts
of probability prediction models can help us to clearly
recognize what it is that we should be searching for in our
statistical analyses of the FCC data. In the next section we
describe the approach we have undertaken to discover an
efficient partitioning scheme.



2. THE APPROACH

One can hypothesize a host of variables which influence the
return probability of a student. Among these are:

• Socioeconomic variables such as employment status
  and source and amount of income

• Demographic variables such as age, sex, marital status
  and family responsibility

• Academic performance and environment variables such
  as credit hours accumulated, pass/fail experiences in
  the previous quarters and other general experience
  with the college, area of study, etc.

• Personality trait variables such as ambition,
  commitment to college education/degree and intellectuality

It is obviously not an easy task to take this myriad of
factors into consideration in one comprehensive model.
Nor does it seem possible to separate the significant variables
from the nonsignificant ones on an a priori basis without
making appropriate analyses of actual data. Since exhaustive
analyses of this nature are likely to take considerable time
even if we have suitable data, a more indirect and expedient
approach has been taken first. This report will mainly
deal with a discussion of this approach and its results.


In this approach, the past enrollment behavior of a student
has itself been sought as a predictive variable of his
future reenrollment probability. The rationale for this
step has been that, whatever be the variables affecting
enrollment, the imprint of their sum effect would be left
on the past enrollment behavior. Hence the past enrollment
pattern, suitably quantified, might itself be a predictor,
assuming that the variables which operated in the past
continue to do so in the future also. Such a predictor may not be
as efficient as one using the underlying variables, but
this handicap may be more than compensated by virtue of
the fact that the necessary data for forming the predictor
are readily available.

The enrollment pattern may be quantified by simply observing
if the student did or did not enroll in each of the previous
four quarters. Enrollment in a particular quarter is
indicated by the numeral '1' while nonenrollment is indicated
by the numeral '0'. With this binary notation, a student
who enrolled three quarters ago and not in others in the past
four quarters can have his pattern represented as '0100'. There are
sixteen possible permutations over four quarters, each giving rise to a
particular enrollment pattern. The enrollment patterns represented
in this fashion are used as our predictor variable.




In the scheme presented here we ignore students who have
not enrolled in any of the previous four quarters*.
In other words the permutation '0000' is excluded, leaving
us with 15 patterns. The justification for this is the
observation that a student who continually absents himself
four quarters has less than 10% probability of ever
returning to school. Table 1 shows data compiled from a
FCC Computer Center Report** illustrating this point.
There the number of students dropping out in various quarters
is shown. The number of these dropouts who return five
or more quarters later (that is, after at least four quarters
of absence) is shown in the last column of the Table and
is seen to be seldom higher than 10%. The percentage
which return to school exactly on the fifth quarter after
dropout is even less than this, and the same data in Table
1 shows it to be less than 4% for all quarters considered
there.



*  Here and elsewhere in this report the "previous four
   quarters" refer to the four quarters immediately
   preceding the one for which enrollment prediction is
   to be made.

** FCC Computer Center Student Retention Study Report
   dated September 5, 1972.



What this means is that in predicting enrollment for a
particular quarter, affiliate students who have not enrolled
in the previous four quarters may be ignored without
introducing any large amount of error. As mentioned earlier
this has been done in the work reported here. However,
it must be noted that the methodology employed can be applied
to enrollment pattern over any number of past quarters,
so that if the present limits imposed are to be relaxed
in future it could be done without undue difficulties.
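
The count of patterns, and the ease of extending the scheme to a longer history, can be illustrated in Python. This is a minimal sketch; the function name is hypothetical, and the only rule applied is the one stated above, that the all-zero permutation is excluded.

```python
from itertools import product


def affiliated_patterns(n_quarters=4):
    """List every binary enrollment pattern except the all-zero one."""
    all_patterns = ["".join(bits) for bits in product("01", repeat=n_quarters)]
    return [p for p in all_patterns if "1" in p]


four_quarter = affiliated_patterns()    # the scheme used in this report
eight_quarter = affiliated_patterns(8)  # a hypothetical longer history
print(len(four_quarter), len(eight_quarter))
```

The number of subsets roughly doubles with each added quarter, so a longer history leaves fewer students in each subset for estimating its return probability.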



To represent the above in terms of set theory notations
introduced earlier: the set S of total population is
considered to be all affiliate students who are yet to pass
course X and who have enrolled in at least one of the previous
four quarters. This set is partitioned into 15 subsets
S_1, S_2, ..., S_15 depending on the past enrollment pattern.
The following sections of the paper present results of
analyses performed to see how well this partitioning scheme
could function as a predictive base of future enrollment.






3. DATA PROCESSING AND ANALYSIS

The FCC Computer Center undertook on behalf of the
Office of Institutional Research a special processing
of FCC grade file to produce data for the analysis.
Among other things this task involved forming the set
S and its fifteen subsets (partitions) for each of the Fall, Winter,
Spring and Summer quarters of academic year '72. For each
of these subsets the number in the subset as well as the
number who enrolled in the quarter under consideration were
computed. Table 2 contains this summary data. All the
analyses reported below were performed on this summary
data.



The first analysis was to see if there were significant
differences among the 15 enrollment patterns and the
four quarters with regard to reenrollment probabilities.
Even a visual examination of Table 2 reveals strong
indications of between-pattern differences. Also, summer
quarter is evidently different. But any differences between
the other three quarters are not apparent to a casual
observer. Hence a two-way analysis of variance (with the
fifteen enrollment patterns providing one classification
and the Fall, Winter and Spring quarters providing the
other) of the data was deemed appropriate, to test
statistically various hypotheses regarding between-group
differences. Since the observed variable is a proportion
based on sample sizes that vary from group to group,* a
two-way analysis of variance with unequal cell sizes for
proportions was necessary. The proportions in each cell
were converted to logit scale by the formula:

[Tabular material appeared at this point in the source, including
Table 2 (for each of the 15 enrollment patterns and each quarter,
the number of students in the subset and the number who enrolled).
The table bodies are not recoverable from this copy.]
$$l_{ij} = \ln \frac{g_{ij} + 1/2}{n_{ij} - g_{ij} + 1/2}$$

where

      l_ij = the logit scale observation in cell (i,j)
      n_ij = number of students in cell (i,j)
      g_ij = number of students in cell (i,j) who enroll
             in quarter j

The analysis of variance of the logits is in Table 3.**



*  Cell (i,j) refers to the group of students with enrollment
   pattern i in quarter j.

** The analysis of variance was performed on the OS/360
   System using a specially programmed FORTRAN routine called
   UNOVA 2. For details of the procedure see G. Snedecor,
   "Statistical Methods," 6th Ed., Iowa State Press, pp. 497.
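
The conversion of cell counts to the logit scale can be sketched in Python. The sketch assumes the transformation is the empirical logit with a 1/2 added to both counts, which is what the partly legible formula appears to show; the function names and the argument names n (students in the cell) and g (students in the cell who enroll) are hypothetical.

```python
import math


def empirical_logit(n, g):
    """ln((g + 1/2) / (n - g + 1/2)) for a cell with g enrollees out of n."""
    return math.log((g + 0.5) / (n - g + 0.5))


def inverse_logit(y):
    """Map a logit scale value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-y))


print(empirical_logit(100, 50), empirical_logit(10, 10))
```

The added 1/2 keeps the logit finite in cells where every student, or no student, enrolls, which matters when some pattern-by-quarter cells are small.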



[Table 3. Two-way analysis of variance of the logits (enrollment
patterns by quarters). Table body not recoverable from this copy.]



The analysis shows that the interaction between quarters
and enrollment patterns is itself highly significant, as
determined by comparing the interaction sum of squares
against the percentiles of a chi-square distribution. This implies
that the effect of enrollment patterns is dependent on
the quarter, so that one cannot talk of a 'quarter effect'
or 'enrollment effect' in isolation. More specifically,
the existence of interaction shows that the difference
between enrollment patterns is itself not constant, but
varies from quarter to quarter. Table 2 bears out this
point. Compare, for example, the enrollment patterns "1010" and
"1011", the sole difference between the two being that in
the latter the students had also enrolled in the immediately
preceding quarter. For Fall '71 the difference between
the two patterns in their enrollment probabilities was
0.39, while for Winter and Spring it was .46 and .58
respectively. Consistently similar observations are made
whenever patterns differing only in the last quarter of
enrollment are compared: the difference between the two
patterns is less in Fall than in Spring or Winter.




A simple explanation may be offered to account for the
interaction. The quarter immediately preceding
Fall is Summer, which can be justifiably considered as not
the same as the other three quarters of an academic year.
It may be hypothesized that enrollment behavior in a
Summer quarter is not as strongly related to dropout
tendencies in the student as it is in the case of the
other 'regular' quarters. Hence the difference between
two patterns such as '1010' and '1011' is less marked in
Fall than in Spring or Winter.

In comparing enrollment patterns due consideration must
therefore be given to difference between Summer and regular
quarters. Subject to this qualification, the data in
Table 2 strongly suggest that reenrollment probabilities
are higher for students who have shown a consistent
enrollment behavior in the past.



A rearrangement of the data in Table 2 brings out strikingly
clearly the various factors affecting the probabilities.
Table 4 has been prepared after this rearrangement, where
we have the data grouped by Fall, Winter and Spring
quarters.



[Table 4. Reenrollment probabilities by enrollment pattern in the three preceding regular quarters and by enrollment status in the preceding Summer quarter. The table body is not legible in the source scan.]

However, in describing the enrollment pattern we use
only the three preceding regular quarters (i.e., ignore
the preceding summer quarter). This gives rise to 2^3 or
8 enrollment patterns described by three-digit binary
numbers shown in the extreme left-hand column
of the table. The summer enrollment status is considered
in the table by having two columns for each quarter, the
first column corresponding to those who did not enroll in
the preceding summer and the second column corresponding to those who did.
Thus, the entry "4.12" in the first column under Spring quarter
against the pattern '100' means that the reenrollment
probability in Spring '72 was 0.0412 (4.12 percent) for students who

a) were enrolled in the preceding Spring (i.e., Spring
'71) quarter but not in Fall or Winter, and

b) were not enrolled in the preceding Summer quarter.

The order in which the 8 enrollment patterns are listed is
also worth noting. The first row is '000', signifying
students who did not enroll in any of the three preceding
regular quarters. The next one is '100', where the students
had enrolled only three quarters ago (ignoring any Summer
quarter enrollment). By hypothesis we expect the group
'100' to have a higher retention probability than '000'.

- «• ' . ' . * 


Similarly the third pattern in the list, namely '010',
is expected to rank higher than '100' if we postulate
further that with more recent enrollment experience, the
reenrollment probability gets higher. With these two
hypotheses as guide, the 8 patterns were arranged to
produce an increasing retention probability. The
pattern '001' follows '110' in the list with the expectation
that though the latter represents a greater number of quarters
enrolled in, the former has more recent enrollment
experience.

It is very gratifying that the Table 4 data follow
the pattern expected, thus giving credence to the hypotheses.
With only a few exceptions in all the columns, the
probabilities increase as one goes down the rows. Further, the
columns representing Summer enrollment have higher
probabilities than the corresponding columns representing
non-enrollment in Summer. This is further in line with our
hypotheses.



The few exceptions noted mostly occur in the Summer quarter,
confirming the earlier observation that the phenomena affecting
Summer enrollment are apparently different.






4. CONCLUSIONS FROM THE ANALYSIS

To summarize the observations made during the analysis,
one might conclude the following:

1. Past enrollment behavior of a student does seem to
provide a viable base for building predictive models
of future enrollment.

2. The more often a student has enrolled in the past,
the higher his probability of return is.

3. The more recent a student's enrollment is, the higher
his probability of return.

4. Enrollment behavior in a Summer quarter is different
from that of the other quarters.

The above observations are qualitative. One might be
interested in knowing, for example, precisely what quantitatively
is the effect of enrollment two quarters ago in a 'regular'
quarter on return probability for this quarter. The best
way to obtain such quantitative measures is to fit a model
incorporating specifically parameters representing these
effects. Such a model was constructed and fitted to the data.
The model construction and results are described below.

A plausible model to represent the effect 'a' of a factor
influencing the probability of an event is:



$$p' = p + a(1 - p) \qquad (1)$$



where p' is the probability of the event when the factor is
operative and p is the probability of the event where it
is not known if the factor is operative or not. 'a' in (1)
can be considered as a proportion by which (1-p), the
probability of the event not occurring, is reduced. It can also
be considered as a conditional probability in the following
sense. To make this explanation simpler, assume the event
is "reenrollment in Fall '71 quarter", and the factor is
'enrollment two quarters ago' (i.e., Spring '71). p is the
general reenrollment probability for Fall '71 and 'a' is
the probability of reenrollment of students who had enrolled
two quarters ago. The reenrollment probability in Fall
'71 for students who had enrolled in Spring '71 can then
be considered as affected by two forces: one representing
the 'attraction' of the Fall '71 quarter to students for
reenrollment and the other representing the effects of
enrollment 'two quarters ago'. If either of these two forces
induces the students to reenroll, we have realized the
event. With this as a model of the process, the probability
of the event can be easily written down as

$$p' = p + a - ap = p + a(1 - p)$$

with the additional assumption that the two forces act
statistically independently.



With the same reasoning it can be generalized that if
there are n statistically independent factors acting on a
probability then the combined effect of their forces
would be

$$p' = 1 - (1 - p)(1 - a_1)(1 - a_2) \cdots (1 - a_n) \qquad (2)$$

where $a_1, a_2, \ldots, a_n$ are the individual effects of the
factors.



If we define

$$y = \ln(1 - p'), \qquad m = \ln(1 - p), \qquad b_i = \ln(1 - a_i),$$

then (2) can be written conveniently as the additive model

$$y = m + b_1 + b_2 + \cdots + b_n \qquad (3)$$

m can be considered as the general mean in the model.
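As a quick numerical check of the step from (2) to (3), the following Python sketch evaluates the combined probability in both the product form and the additive log form. The baseline probability and the factor effects used here are hypothetical values chosen only for illustration; they are not taken from the report.

```python
import math

def combined_probability(p, effects):
    """Product form (2): p' = 1 - (1 - p) * prod(1 - a_i)."""
    complement = 1.0 - p
    for a in effects:
        complement *= (1.0 - a)
    return 1.0 - complement

def combined_probability_additive(p, effects):
    """Additive form (3): y = m + sum(b_i), with m = ln(1 - p), b_i = ln(1 - a_i)."""
    m = math.log(1.0 - p)
    y = m + sum(math.log(1.0 - a) for a in effects)
    return 1.0 - math.exp(y)

# Hypothetical baseline probability and factor effects (illustration only).
p = 0.30
effects = [0.20, 0.10]

p_product = combined_probability(p, effects)
p_additive = combined_probability_additive(p, effects)
print(round(p_product, 6), round(p_additive, 6))
```

With a single factor the product form reduces to equation (1). A negative effect a_i (a factor that lowers the probability) corresponds to a positive b_i in the additive form.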



It is this additive model using the logarithmic transformation
(3) which has been employed in constructing the model to
fit reenrollment probabilities.*

* This model is the same as that used for computing the
reliability of a system with parallel, redundant units.

In keeping with the analytical findings (1)-(4) above,
the parameters considered in the model are:

m = general mean
$b_{11}$ = effect of enrollment 1 quarter ago in a 'regular' quarter
$b_{12}$ = effect of dropout 1 quarter ago in a 'regular' quarter
$b_{13}$ = effect of enrollment 1 quarter ago in the summer quarter
$b_{14}$ = effect of dropout 1 quarter ago in the summer quarter

Similarly $b_{21}, b_{22}, b_{23}, b_{24}, b_{31}, \ldots, b_{43}, b_{44}$
are defined on the enrollment two, three and four quarters
ago.

There are 17 parameters including m. Note that the effects of
enrollment and dropout in a quarter have been separately
introduced as two different effects. The effect of dropout
(non-enrollment) in a particular quarter may not just amount
to leaving the overall probability p undisturbed but may
actually decrease it or otherwise affect it.

With this model one can easily write down the probability
p of, say, a student with enrollment pattern '1100'
reenrolling in Winter '72 as follows:

$$\ln(1 - p) = m + b_{41} + b_{31} + b_{24} + b_{12} \qquad (4)$$

where

m = general mean
$b_{41}$ = the effect of enrolling four quarters ago in a 'regular' quarter
$b_{31}$ = the effect of enrolling three quarters ago in a 'regular' quarter
$b_{24}$ = the effect of dropping out two quarters ago in a summer quarter, and
$b_{12}$ = the effect of dropping out one quarter ago in a regular quarter.



The seventeen parameters (m and the b's) were estimated by the
Least Squares Method, fitting the above model to the 60
observations in Table 2.

Appendix B gives the details of this procedure. As shown
there, the model is such that four of the parameters can
be arbitrarily set to zero; the remaining thirteen
would be then determined.

The values were:

$b_{11}$ = -0.648887
$b_{12}$ = 0.40284
$b_{13}$ = -0.840544
$b_{21}$ = -0.214826
$b_{22}$ = 0.261992
$b_{23}$ = -0.485431
$b_{31}$ = -0.070537
$b_{32}$ = 0.206000
$b_{33}$ = -0.160537
$b_{41}$ = -0.54608
$b_{42}$ = -0.264895
$b_{43}$ = -0.149429
m = -0.179431



Since four of the parameters were given arbitrary values,
the above numbers should not be given an absolute meaning
but have significance only relative to each other. How
do we measure the effect of "enrollment behavior one quarter
ago" from the above data? A valid measure is ($b_{11} - b_{12}$),
which compares the effect of enrollment one quarter ago with
the effect of dropout in the same quarter. Similarly
($b_{21} - b_{22}$) measures the effect of enrollment behavior two
quarters ago, etc. These quantities are computed and shown
below:

$b_{11} - b_{12}$ = -1.05
$b_{21} - b_{22}$ = -0.48
$b_{31} - b_{32}$ = -0.28
$b_{41} - b_{42}$ = -0.28




From model (4) it is clear that a negative b represents
a force that tends to increase the reenrollment probability.
Thus the above data clearly and quantitatively show
that:

a) enrollment in any of the previous four quarters
increases the reenrollment probability

b) the more recent the enrollment, the higher its
'beneficial' impact on reenrollment

These, of course, were the conclusions offered earlier, but
we have now quantified these hypothesized effects.

The effect of summer quarter enrollment one quarter ago is
similarly represented by ($b_{13} - b_{14}$), and since $b_{14}$ is set to
zero, $b_{13}$ itself is this measure. We therefore have:

$b_{13} - b_{14}$ = -0.84
$b_{23} - b_{24}$ = -0.49
$b_{33} - b_{34}$ = -0.16
$b_{43} - b_{44}$ = -0.15

Note again the positive effect of enrollment in Summer
and how it is stronger for more recent quarters. Further,
a comparison with the regular quarter enrollment effect shows
that the Summer quarter effects are generally smaller.
This is in conformance with the fourth conclusion presented
earlier in this section.

Finally, we can use the parameter values generated above and
see how well they estimate the reenrollment probability.
This has been done and presented in Table 5, where for each
of the sixty observations (four quarters x fifteen enrollment
patterns) the actually observed reenrollment probability
as well as that estimated with the model are given.
The difference between the two is also indicated. The model
is seen to fit the data very well, especially in the regular
quarters.
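To make the use of the fitted parameters concrete, here is a minimal Python sketch that evaluates the additive model for a given four-digit enrollment pattern and quarter. The function name, the `SUMMER_LAG` table and the dictionary layout are illustrative choices, not part of the report. The parameter values are transcribed from the scanned list; the digits of b31 and b42 and the label of the general mean are hard to read there, so those entries should be treated as assumptions.

```python
import math

# Lag (in quarters ago) at which the Summer quarter falls, for each quarter
# being predicted. Fall directly follows Summer, so its lag is 1.
SUMMER_LAG = {"Fall": 1, "Winter": 2, "Spring": 3, "Summer": 4}

# General mean and effects B[(lag, status)]. Status: 1 = enrolled (regular),
# 2 = dropped out (regular), 3 = enrolled (Summer), 4 = dropped out (Summer).
# Transcribed from the scan; b31, b42 and the label of M are assumptions.
M = -0.179431
B = {
    (1, 1): -0.648887, (1, 2): 0.40284,   (1, 3): -0.840544, (1, 4): 0.0,
    (2, 1): -0.214826, (2, 2): 0.261992,  (2, 3): -0.485431, (2, 4): 0.0,
    (3, 1): -0.070537, (3, 2): 0.206000,  (3, 3): -0.160537, (3, 4): 0.0,
    (4, 1): -0.54608,  (4, 2): -0.264895, (4, 3): -0.149429, (4, 4): 0.0,
}

def estimated_return_probability(pattern, quarter):
    """pattern: four binary digits, leftmost digit = four quarters ago."""
    log_complement = M  # running value of ln(1 - p)
    for position, digit in enumerate(pattern):
        lag = 4 - position
        enrolled = digit == "1"
        if lag == SUMMER_LAG[quarter]:
            status = 3 if enrolled else 4
        else:
            status = 1 if enrolled else 2
        log_complement += B[(lag, status)]
    return 1.0 - math.exp(log_complement)

for pattern, quarter in [("1000", "Fall"), ("1011", "Summer"), ("0010", "Summer")]:
    print(quarter, pattern, round(estimated_return_probability(pattern, quarter), 5))
```

Under these assumptions the Fall '1000' and Summer '1011' cases should land close to the corresponding model estimates in Table 5. The Summer '0010' case illustrates that the additive form can produce an estimate below zero when the positive dropout effects outweigh the rest, which Table 5 also shows for several Summer rows.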



5. IMPLICATIONS FOR CONSTRUCTING
THE FORECASTING MODEL

What do all the analyses above imply in regard to our
effort to construct a forecasting model for the FCC affiliate
population? Essentially, they have shown that it is valid to
partition the affiliate population by the past enrollment
behavior for prediction purposes. There are clear differences
between these partitions with regard to the reenrollment
probability and these differences can be logically explained.

TABLE 5

ACTUAL VS ESTIMATED RETURN PROBABILITIES

QUARTER        PATTERN    ACTUAL        MODEL ESTIMATE    DIFFERENCE

FALL 1971      0001       0.55000       0.55819           -0.00819
               0010       0.39328       0.36438            0.02890
               0011       0.75843       0.72574            0.03268
               0100       0.12074       0.22343           -0.10268
               0101       0.57143       0.66493           -0.09350
               0110       0.62829       0.51794            0.11035
               0111       0.69072       0.79200           -0.10128
               1000       0.10270       0.22703           -0.12432
               1001       0.59091       0.66648           -0.07557
               1010       0.46400       0.52018           -0.05618
               1011       0.85417       0.79297            0.06120
               1100       0.29879       0.41377           -0.11498
               1101       0.71569       0.74706*          -0.03137
               1110       0.71836       0.63610            0.08226
               1111       0.90525       0.84299*           0.06226

WINTER 1972    0001       0.70708       0.58820            0.11888
               0010       0.15152       0.27451           -0.12300
               0011       0.76568       0.74656            0.01911
               0100       0.10673       0.10598            0.00075
               0101       0.65079       0.68768           -0.03689
               0110       0.41135       0.44979           -0.03844
               0111       [illegible]   0.80779            0.03994
               1000       [illegible]   0.11012           [illegible]
               1001       0.60748       0.68914*          -0.08166
               1010       0.29545       0.45234           -0.15688
               1011       0.75532       0.80868           -0.05336
               1100       0.15372       0.32511           -0.17139
               1101       0.81762       0.76424            0.05338
               1110       0.48837       0.58465           -0.09628
               1111       0.92188       0.85490            0.06698

SPRING 1972    0001       0.66009       0.56448            0.09561
               0010       0.11994       0.22610           -0.10616
               0011       0.76440       0.72965            0.03475
               0100       0.05023      -0.06181*           0.11204
               0101       0.58696       0.62907           -0.04212
               0110       0.20213       0.34088           -0.13875
               0111       0.76159       0.76975           -0.00816
               1000       0.04120       0.05887           -0.01767
               1001       0.64029       0.67123           -0.03094
               1010       0.23324       0.41579           -0.18255
               1011       0.81968       0.79592            0.02377
               1100       0.09346       0.19845           -0.10499
               1101       0.74457       0.71999            0.02457
               1110       0.30374       0.50244           -0.19870
               1111       0.90357       0.82618            0.07738

SUMMER 1972    0001       0.42073       0.30254            0.11818
               0010       0.12375      -0.23935            0.36310
               0011       0.45057       0.56705           -0.11648
               0100       0.02740      -0.51417            0.54157
               0101       0.30488       0.47105           -0.16617
               0110       0.11220       0.06007            0.05212
               0111       0.44752       0.67165           -0.22414
               1000       0.03250      -0.71941            0.75191
               1001       0.34375       0.39935           -0.05560
               1010       0.07692      -0.06733            0.14425
               1011       0.40244       0.62715           -0.22471
               1100       0.04464      -0.30401            0.34865
               1101       0.42857       0.54447           -0.11589
               1110       0.20446       0.19054            0.01393
               1111       0.63260       0.71723           -0.08462

* Cell not fully legible in the source; the value shown is the one for
which Difference = Actual - Estimate holds with the other two columns.

As mentioned elsewhere, past enrollment behavior may not
be the immediate causal factor determining future enrollment
of a student, but the analysis leads one to believe that
it effectively captures and 'summarizes' the total effect
of all the real, underlying factors. An exception may be
the affiliate students who do not have a sufficiently long
past enrollment history to base the prediction on. For
example, if the student was a new enrollee last quarter, then
the sole information in his past history is that he enrolled
last quarter. This fact by itself may not be significant
enough to tell a lot about his future enrollment.* Possible
methods to fortify the model in this respect are discussed
in the last section of this paper.

* Students newly enrolled last quarter have the pattern
"0001". It is interesting to note that in Table 4, it is
data pertaining to these points which were out of the
general trend observed there. Also, from Table 5 it is
seen that the model estimates the probability of "0001"
to be about 0.56 in the regular quarters. The proximity
to 0.5 can be taken as an indication of the high variability
to be expected from this group.

A most attractive feature of this approach, which might more
than compensate for any of its weaknesses, is the fact that
the forecasting model can be operationalized with the FCC
data available today. The master grade file kept by the
computer center has information on the complete history of
each student since the beginning of the college. From this
file, it is a routine data processing problem to select
affiliate students who have been in FCC within the past one
academic year and then assign them uniquely one of the
fifteen enrollment patterns. The generation of Table 2
data from the FCC files of course proves the feasibility
of this approach. With the pattern established and the
corresponding probability for return in the next quarter
computed, an arithmetic sum of these probabilities would
give the expected number of returnees. The actual formula
to be used here is of the form (1).
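The summation described above can be sketched in a few lines of Python. The head counts per pattern below are hypothetical and only three patterns are shown; the return probabilities are three Fall 1971 model estimates as read from Table 5. The function name is an illustrative choice.

```python
def expected_returnees(counts, return_probabilities):
    """Sum over patterns of (students in the pattern) x (return probability)."""
    return sum(n * return_probabilities[pattern] for pattern, n in counts.items())

# Hypothetical numbers of affiliate students per enrollment pattern.
counts = {"1000": 200, "1011": 150, "1111": 400}

# Fall 1971 model estimates for these patterns, as read from Table 5.
return_probabilities = {"1000": 0.22703, "1011": 0.79297, "1111": 0.84299}

forecast = expected_returnees(counts, return_probabilities)
print(round(forecast, 1))
```

Summing individual probabilities student by student gives the same total, since every student in a pattern carries the same probability.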

One question arises here. Should we use in (1) the
probabilities as they were observed (Table 2) or should we rather
use the probabilities as estimated by the model (Table 5)?
There are some good points about using the model probabilities
since they have smoothed out random effects in the observed
probabilities due to sampling errors and other perturbations.
Also, with the model we need only thirteen parameters to
generate the 60 probabilities whereas, if we were to use
the observed probabilities themselves, we have effectively
a sixty-parameter model. Thus, one might prefer the model
over the actual observations. However, strictly speaking,
this should be done only after the model is further validated
by future data. Thus at this time it is not wise to
discard either possibility; we must seek to do the necessary
validation with more data.



6. FURTHER WORK



In the light shed by the work done so far and reported in
this interim technical paper, the following tasks seem to
[remain in developing] a forecasting model for freshmen
requirement courses. The tasks are not necessarily listed
in their chronological order.

1) Validate the model with data similar to Table 2
but pertaining to other quarters, preferably recent
ones.

2) Fortify the model, if necessary, with other explanatory
variables, enrollment history going back more than
four quarters, etc. There is much possibility here.
We might compare groups of students comprising the
partition '1111' with, say, those comprising the
partition '1010' to discover probable explanatory
variables. The computer program which generated
Table 2 is already capable of providing information
such as age, sex, marital status and course history
of any individual student. The use of further
explanatory variables may be especially beneficial to
partitions like '0001', which have a large number of
students with little historical enrollment data.



3) Operationalize the model. Systems for data collection,
analysis and feeding the model have to be set up.
The system should be capable of forecasting more
than one quarter hence and also capable of accepting
data on 'control variables' (such as number of
new enrollees to be taken into the school in the
coming quarters) and integrating them meaningfully
into the forecast. The general mathematics for
accomplishing this has been established (see Appendix
A) but remains to be 'particularized' to the final
forecasting model we will be coming up with.



APPENDIX A 



OPERATIONALIZING PREDICTIVE MODELS

While discovering predictive variables is often the major
problem in a forecasting task, designing an operational
system to use the prediction scheme is of no less importance.
From the operational point of view there are three aspects
to be considered. First, one must determine the means for
measuring the predictor variables from available data and
design the information system which would regularly supply
the necessary data. This task largely depends on the second
and central aspect of the problem, namely constructing the
mathematics which would let one take the data and transform
it into the predictor variable and then into the forecasts
for one or more future periods. The third aspect to be
considered is that of monitoring the system to detect those
changes which would oblige us to modify the forecasting
scheme, or shifts in model parameters.

The forecasting models of interest in this study proceed by
partitioning the set S of affiliate student population into
subsets $S_1, S_2, \ldots, S_m$ and computing the probabilities
$r_1, r_2, \ldots, r_m$ to be associated with these subsets. The forecast
is then given by $r_1 s_1 + r_2 s_2 + \cdots + r_m s_m$, where
$s_i$ = number of students in $S_i$. Given $s_1, s_2, \ldots, s_m$ the
forecasting task is simple. But the availability of this
information cannot be always taken for granted. The forecast
may need to be made at a time when the data required to compute
one or more of the $s_i$ are not yet available. This is
invariably the case when one is trying to forecast for a
number of future periods using a model which needs the $s_i$
variables for the immediately preceding period. The model using
past enrollment pattern as a predictive variable discussed
in the text is a good example. To predict the number of
returnees from the affiliate student population for quarter
i, we need to classify each student in one of fifteen categories
based on his enrollment history in the four quarters (i-4),
(i-3), (i-2) and (i-1). Therefore, if we are required to
forecast for quarter (i+1) as well as quarter i at the same time,
strictly speaking we will need enrollment data for quarter i,
which of course would not be available to us.

A natural way to deal with this situation is to find a means
for forecasting the predictive variables $s_1, \ldots, s_m$ for period
(i+1) from the knowledge of these variables for the period
i. In other words, if $s_1(i), s_2(i), \ldots, s_m(i)$ stand for the
predictor variables in period (i), then what we need is a
set of transforms $f_1, f_2, \ldots, f_m$ such that:

$$s_1(i+1) = f_1(s_1(i), s_2(i), \ldots, s_m(i))$$
$$s_2(i+1) = f_2(s_1(i), s_2(i), \ldots, s_m(i))$$
$$\vdots$$
$$s_m(i+1) = f_m(s_1(i), s_2(i), \ldots, s_m(i))$$



The above equations may be called the 'System Dynamics
Equations' since in a sense they describe the movement of the
system from one period to the next. Quite often, in order to
completely specify the dynamics, it is necessary to introduce
variables other than $s_1, \ldots, s_m$. For example, to specify
the number of students in the category '0001' for quarter
(i+1) we must know the new enrollees to be admitted into
the school in the quarter i. Variables like these, needed
to completely describe the system dynamics, may be called
"input variables" and denoted by $\{e_1, e_2, \ldots, e_n\}$. Some of
the input variables may be determined by decision makers
while others may be purely dependent on the state of nature.
Including the input variables, the system dynamics equations
assume the vector form

$$\mathbf{s}(i+1) = \mathbf{f}(\mathbf{s}(i), \mathbf{e})$$

where s(i+1) and s(i) are m x 1 vectors, f is an m x 1 vector
function and e is an n x 1 vector. One of the prime needs for
forecasting for a number of periods is therefore the ability
to predict the input variables e also.

It is interesting to note here that the model using four
quarter enrollment history as a predictor variable has an
associated set of system dynamics equations that are quite
simple. The only input variable needed for completing the
dynamics equations is the number of fresh entrants into the
affiliate population in the quarters of interest. This
variable is to a certain extent a control variable and should
not prove difficult to assign values to.
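One way such dynamics could look for the four-quarter pattern is sketched below in Python. This is an illustration under stated assumptions, not the report's own equations: each pattern 'abcd' is assumed to shift to 'bcd1' for students who return and to 'bcd0' for those who do not, students whose pattern would become '0000' are assumed to leave the affiliate population, and fresh entrants are added to '0001'. The function name and all numbers in the example are hypothetical.

```python
def advance_one_quarter(counts, return_probabilities, new_entrants):
    """Expected pattern counts one quarter ahead (illustrative assumptions).

    counts: dict mapping a 4-digit pattern to the expected number of students.
    return_probabilities: dict mapping each pattern to its return probability
        for the quarter being forecast.
    new_entrants: number of fresh entrants enrolling in that quarter.
    """
    next_counts = {}
    for pattern, n in counts.items():
        r = return_probabilities[pattern]
        returned = pattern[1:] + "1"      # drop the oldest quarter, append enrollment
        not_returned = pattern[1:] + "0"  # drop the oldest quarter, append non-enrollment
        next_counts[returned] = next_counts.get(returned, 0.0) + n * r
        if not_returned != "0000":        # assumed to leave the affiliate population
            next_counts[not_returned] = next_counts.get(not_returned, 0.0) + n * (1.0 - r)
    next_counts["0001"] = next_counts.get("0001", 0.0) + new_entrants
    return next_counts

# Hypothetical counts, probabilities and entrants.
counts = {"1000": 100, "0001": 50}
probs = {"1000": 0.1, "0001": 0.5}
next_counts = advance_one_quarter(counts, probs, new_entrants=30)
print(next_counts)
```

Applying the function repeatedly, with the return probabilities appropriate to each successive quarter, gives forecasts for several quarters ahead from a single starting classification.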



Designing an information system to supply the
predictive model with necessary basic data is not very difficult
either. What would essentially be needed is a computer program
to run against the current grade file to classify all affiliate
students (or at least those who have enrolled in one or more
quarters in the past one year) into one of the fifteen patterns
and count the number of students in each group. Such a computer
program exists even now and it should be possible to make
it a 'production program' for regular use with little or
no modification. With this data and the estimates of
fresh inputs expected in the coming quarters, forecasting
for a number of quarters can be made.

*• > 

The third aspect mentioned also deserves attention. The
forecasting method is only as good as the parameters used.
In the context of our particular model, the parameters are the
return probabilities for the various classifications. It
is necessary that a continuous check be made with current
data to see if the parameter values used are indeed valid.
For key parameters, quality control charts or other statistical/
graphical techniques may be used to do this monitoring.
It is feasible to integrate this aspect with the total
forecasting system, so that as forecasts are made and
compared with the actual, the current parameter values
are evaluated and tested for possible significant shifts.

One of the tasks in the next phase would be
to build such an integrated forecasting system out of the
basic model developed in this paper.
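As one possible form such monitoring could take, the Python sketch below flags a return probability whose observed rate falls outside control limits of the kind used on a proportion chart. The report names quality control charts only in general terms; the three-sigma binomial limits, the function name and the student counts here are assumptions made for illustration.

```python
import math

def return_rate_out_of_control(expected_p, n_students, n_returned, z=3.0):
    """True if the observed return rate lies outside z-sigma binomial limits.

    expected_p: return probability currently used for the pattern.
    n_students: number of students classified into the pattern.
    n_returned: number of those students who actually reenrolled.
    """
    sigma = math.sqrt(expected_p * (1.0 - expected_p) / n_students)
    observed = n_returned / n_students
    return abs(observed - expected_p) > z * sigma

# Hypothetical check of one pattern whose assumed return probability is 0.79297.
print(return_rate_out_of_control(0.79297, 400, 310))  # small deviation
print(return_rate_out_of_control(0.79297, 400, 260))  # large deviation
```

A flagged pattern would be a candidate for re-estimating its return probability before the next forecast is made.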



APPENDIX B

FITTING AN ADDITIVE MODEL TO THE
LOGARITHM OF REENROLLMENT PROBABILITIES

If p is the proportion of students with a particular past
enrollment pattern who reenroll in a given quarter, then the model
hypothesizes that:

$$\ln(1 - p) = m + b_{4i} + b_{3j} + b_{2k} + b_{1l} + e \qquad (1)$$

where each of the indices i, j, k, l takes one of the four values
1, 2, 3, 4 according to the following rule stated for $b_{4i}$:

(1) i takes the value 1 (i.e., $b_{4i}$ is the variable $b_{41}$)
if the students had enrolled four quarters ago in a regular
(non-summer) quarter.

(2) i takes the value 2 (i.e., $b_{4i}$ represents $b_{42}$) if the
students had not enrolled four quarters ago in a regular
quarter.

(3) i takes the value 3 (i.e., $b_{4i}$ is $b_{43}$) if the students
had enrolled four quarters ago, which was a summer quarter.

(4) i is 4 (i.e., $b_{4i}$ is $b_{44}$) if the students had not
enrolled four quarters ago in a summer quarter.

The above rule is also applicable to $b_{3j}$ with the difference
that the enrollment behavior three quarters ago (instead
of four quarters ago) is considered to determine the appropriate
index. Similarly $b_{2k}$ and $b_{1l}$ are determined based on
enrollment two and one quarter prior to that quarter.

As an illustration, if we are considering the reenrollment
probability p of students with enrollment pattern '1011' in
a Summer quarter, then the model suggests that ln(1-p) is
given by

$$\ln(1 - p) = m + b_{43} + b_{32} + b_{21} + b_{11}$$

the random variation due to e having been ignored.

We can similarly write down the expression for any of the
15 enrollment patterns and four quarters for which data is
available.

In this model there are seventeen parameters, namely m
and the sixteen b variables. In the least squares method,
these parameters are estimated from the data so as to minimize
the sum of squares of deviations between the observed ln(1-p) and
the estimated ln(1-p) for the sixty observations we have.
Since p is given by (1) in terms of the parameters, this
method amounts to minimizing:

$$L = \sum_{1}^{60} \left( \ln(1 - p) - m - b_{4i} - b_{3j} - b_{2k} - b_{1l} \right)^2$$



with respect to the m, b parameters. The summation is
over the sixty observations, the appropriate parameters
$b_{4i}, b_{3j}, b_{2k}, b_{1l}$ being determined according to the rules
given before for each observation. The parameter values
minimizing L are obtained by solving the equations resulting
from setting the partial derivative of L with respect
to each parameter to zero. The equations (called normal
equations) are all linear and are exhibited in Table B-1
in matrix form.
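The quantity being minimized can be illustrated with a short Python sketch that evaluates the squared deviation in ln(1-p) for a handful of observations. This evaluates the criterion only; it does not perform the fit. The three (actual, estimate) pairs are Fall 1971 rows for patterns 0010, 0011 and 1010 as read from Table 5, and the function names are illustrative choices.

```python
import math

def squared_log_deviation(observed_p, estimated_p):
    """Squared difference between observed and estimated ln(1 - p)."""
    return (math.log(1.0 - observed_p) - math.log(1.0 - estimated_p)) ** 2

def least_squares_criterion(rows):
    """Sum of squared deviations over (observed, estimated) pairs."""
    return sum(squared_log_deviation(obs, est) for obs, est in rows)

# (actual, model estimate) for Fall 1971 patterns 0010, 0011, 1010 (Table 5).
rows = [(0.39328, 0.36438), (0.75843, 0.72574), (0.46400, 0.52018)]

partial_L = least_squares_criterion(rows)
print(round(partial_L, 4))
```

Summing the same term over all sixty observations gives the full criterion L for a given set of parameter values.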

The 17 by 17 matrix is singular and in fact has rank 13,
as can be seen by the fact that ($b_{21} + b_{22} + b_{23} + b_{24}$) = m = ($b_{11} + \ldots + b_{14}$),
where each symbol denotes the equation (row) for that parameter.
This means, in terms of parameter estimation, that we can
give four parameters any arbitrary values and then
determine the rest. Accordingly $b_{14}, b_{24}, b_{34}, b_{44}$ (representing
the effect of dropout in Summer quarters) were all set to zero
and removed from the equations. Equations 4, 8, 12 and 16
corresponding to these variables were also dropped. The
remaining 13 x 13 matrix is non-singular and was solved using
a CALL OS/360 routine called SIMQ available in the FCC library.
The estimated parameter values are:

$b_{11}$ = -0.648887        $b_{31}$ = -0.070537
$b_{12}$ = 0.40284          $b_{32}$ = 0.206000
$b_{13}$ = -0.840544        $b_{33}$ = -0.160537
$b_{21}$ = -0.214826        $b_{41}$ = -0.54608
$b_{22}$ = 0.261992         $b_{42}$ = -0.264895
$b_{23}$ = -0.485431        $b_{43}$ = -0.149429
m = -0.179431





